Overview

Dataset statistics

Number of variables: 19
Number of observations: 104768
Missing cells: 292984
Missing cells (%): 14.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.0 MiB
Average record size in memory: 160.0 B
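
The percentages in this table can be recomputed from the counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (rows × columns) and the memory figure is record size times row count; the variable names are illustrative and not part of the report:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 104_768
missing_cells = 292_984

# Assumption: the missing-cell share is computed over rows x columns.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory is average record size times the row count.
avg_record_bytes = 160.0
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. memory: {total_mib:.1f} MiB")
```

Both derived figures round to the values the report prints (14.7% and 16.0 MiB).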

Variable types

Numeric: 10
Categorical: 7
DateTime: 1
Unsupported: 1

Alerts

Store is highly overall correlated with PromoInterval (High correlation)
CompetitionOpenSinceMonth is highly overall correlated with PromoInterval (High correlation)
CompetitionOpenSinceYear is highly overall correlated with PromoInterval (High correlation)
Promo2SinceWeek is highly overall correlated with StoreType and 2 other fields (High correlation)
Promo2SinceYear is highly overall correlated with StoreType and 1 other field (High correlation)
DayOfWeek is highly overall correlated with Open (High correlation)
Sales is highly overall correlated with Customers and 3 other fields (High correlation)
Customers is highly overall correlated with Sales and 2 other fields (High correlation)
Normalized_Sales is highly overall correlated with Sales and 3 other fields (High correlation)
StoreType is highly overall correlated with Promo2SinceWeek and 1 other field (High correlation)
Promo2 is highly overall correlated with Promo2SinceWeek and 2 other fields (High correlation)
PromoInterval is highly overall correlated with Store and 4 other fields (High correlation)
Open is highly overall correlated with DayOfWeek and 3 other fields (High correlation)
Promo is highly overall correlated with Sales and 1 other field (High correlation)
CompetitionOpenSinceMonth has 30902 (29.5%) missing values (Missing)
CompetitionOpenSinceYear has 30902 (29.5%) missing values (Missing)
Promo2SinceWeek has 77060 (73.6%) missing values (Missing)
Promo2SinceYear has 77060 (73.6%) missing values (Missing)
PromoInterval has 77060 (73.6%) missing values (Missing)
StateHoliday is an unsupported type; check if it needs cleaning or further analysis (Unsupported)
Sales has 16585 (15.8%) zeros (Zeros)
Customers has 16585 (15.8%) zeros (Zeros)
Normalized_Sales has 16585 (15.8%) zeros (Zeros)
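
The missing-value and zero alerts can be cross-checked against the dataset totals. The sketch below only redoes the arithmetic on the figures quoted in this report; the dictionary and variable names are illustrative assumptions:

```python
# Missing-value counts copied from the alerts.
missing_by_column = {
    "CompetitionOpenSinceMonth": 30_902,
    "CompetitionOpenSinceYear": 30_902,
    "Promo2SinceWeek": 77_060,
    "Promo2SinceYear": 77_060,
    "PromoInterval": 77_060,
}
n_observations = 104_768
reported_missing_cells = 292_984

# If no other column has gaps, the five flagged columns sum to the total.
total_missing = sum(missing_by_column.values())

# Per-column share of rows, rounded to one decimal like the report.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_column.items()
}

# Zero share reported for Sales, Customers and Normalized_Sales.
zeros_pct = round(100 * 16_585 / n_observations, 1)

print(total_missing == reported_missing_cells)
print(missing_pct)
print(zeros_pct)
```

The five flagged columns account for every missing cell. The count of 77060 also equals the number of rows with Promo2 = 0 in the Promo2 section, which suggests (but does not prove) that those gaps mark stores outside the Promo2 scheme rather than collection errors.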

Reproduction

Analysis started: 2023-11-14 12:09:51.034638
Analysis finished: 2023-11-14 12:10:08.132053
Duration: 17.1 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json

Variables

Store
Real number (ℝ)

HIGH CORRELATION 

Distinct: 112
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 568.68557
Minimum: 4
Maximum: 1114
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 4
5-th percentile: 64
Q1: 368
Median: 544
Q3: 787
95-th percentile: 1089
Maximum: 1114
Range: 1110
Interquartile range (IQR): 419

Descriptive statistics

Standard deviation: 290.14323
Coefficient of variation (CV): 0.51019973
Kurtosis: -0.77179221
Mean: 568.68557
Median Absolute Deviation (MAD): 211
Skewness: 0.07022797
Sum: 59580050
Variance: 84183.091
Monotonicity: Increasing
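
Several of the descriptive statistics are derived from one another. As a hedged illustration using the Store figures (the variable names are assumptions, and the formulas are the standard definitions rather than anything the report states):

```python
import math

# Values copied from the Store statistics.
mean = 568.68557
std = 290.14323
q1, q3 = 368, 787
minimum, maximum = 4, 1114

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

# The derived values agree with the reported ones up to rounding.
assert math.isclose(cv, 0.51019973, rel_tol=1e-6)
assert math.isclose(variance, 84183.091, rel_tol=1e-6)
print(cv, variance, iqr, value_range)
```

Since Store looks like an identifier rather than a measurement, these moments mostly describe which IDs are present in the sample.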
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
4                   942    0.9%
679                 942    0.9%
755                 942    0.9%
733                 942    0.9%
729                 942    0.9%
726                 942    0.9%
722                 942    0.9%
713                 942    0.9%
709                 942    0.9%
704                 942    0.9%
Other values (102)  95348  91.0%

Value  Count  Frequency (%)
4      942    0.9%
25     942    0.9%
35     942    0.9%
42     942    0.9%
57     942    0.9%
64     942    0.9%
84     942    0.9%
104    942    0.9%
125    942    0.9%
157    942    0.9%

Value  Count  Frequency (%)
1114   942    0.9%
1112   942    0.9%
1101   942    0.9%
1097   942    0.9%
1092   758    0.7%
1089   942    0.9%
1075   942    0.9%
1066   942    0.9%
1033   942    0.9%
1027   758    0.7%

StoreType
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
a: 61620
d: 20540
c: 14130
b: 8478

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c
2nd row: c
3rd row: c
4th row: c
5th row: c

Common Values

Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Most occurring characters

Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  104768  100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  104768  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
a      61620  58.8%
d      20540  19.6%
c      14130  13.5%
b      8478   8.1%

Assortment
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
c: 61620
a: 39380
b: 3768

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c
2nd row: c
3rd row: c
4th row: c
5th row: c

Common Values

Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

Most occurring characters

Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  104768  100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

Most occurring scripts

Value  Count   Frequency (%)
Latin  104768  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
c      61620  58.8%
a      39380  37.6%
b      3768   3.6%

CompetitionDistance
Real number (ℝ)

Distinct: 95
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4021.9645
Minimum: 40
Maximum: 40540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 40
5-th percentile: 60
Q1: 285
Median: 1210
Q3: 4060
95-th percentile: 22330
Maximum: 40540
Range: 40500
Interquartile range (IQR): 3775

Descriptive statistics

Standard deviation: 7051.1133
Coefficient of variation (CV): 1.7531515
Kurtosis: 8.6841229
Mean: 4021.9645
Median Absolute Deviation (MAD): 1040
Skewness: 2.8546501
Sum: 4.2137318 × 10⁸
Variance: 49718199
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
50                 3768   3.6%
220                2826   2.7%
250                2826   2.7%
90                 2826   2.7%
210                2826   2.7%
350                1884   1.8%
580                1884   1.8%
1910               1884   1.8%
140                1884   1.8%
120                1884   1.8%
Other values (85)  80276  76.6%

Value  Count  Frequency (%)
40     942    0.9%
50     3768   3.6%
60     942    0.9%
80     942    0.9%
90     2826   2.7%
120    1884   1.8%
140    1884   1.8%
150    942    0.9%
170    942    0.9%
190    1700   1.6%

Value  Count  Frequency (%)
40540  942    0.9%
33060  942    0.9%
23620  942    0.9%
23130  942    0.9%
22560  942    0.9%
22330  942    0.9%
20620  942    0.9%
20390  942    0.9%
16490  942    0.9%
15340  942    0.9%

CompetitionOpenSinceMonth
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 30902
Missing (%): 29.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.775729
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.1127001
Coefficient of variation (CV): 0.45938969
Kurtosis: -1.1413135
Mean: 6.775729
Median Absolute Deviation (MAD): 3
Skewness: 0.059514846
Sum: 500496
Variance: 9.6889017
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
4                 10362  9.9%
9                 9420   9.0%
6                 8294   7.9%
8                 6594   6.3%
5                 6594   6.3%
3                 6594   6.3%
10                5652   5.4%
12                5652   5.4%
11                5468   5.2%
2                 3768   3.6%
Other values (2)  5468   5.2%
(Missing)         30902  29.5%

Value  Count  Frequency (%)
1      1884   1.8%
2      3768   3.6%
3      6594   6.3%
4      10362  9.9%
5      6594   6.3%
6      8294   7.9%
7      3584   3.4%
8      6594   6.3%
9      9420   9.0%
10     5652   5.4%

Value  Count  Frequency (%)
12     5652   5.4%
11     5468   5.2%
10     5652   5.4%
9      9420   9.0%
8      6594   6.3%
7      3584   3.4%
6      8294   7.9%
5      6594   6.3%
4      10362  9.9%
3      6594   6.3%

CompetitionOpenSinceYear
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 17
Distinct (%): < 0.1%
Missing: 30902
Missing (%): 29.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.6601
Minimum: 1999
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 1999
5-th percentile: 2001
Q1: 2005
Median: 2009
Q3: 2013
95-th percentile: 2014
Maximum: 2015
Range: 16
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.1633051
Coefficient of variation (CV): 0.0020726778
Kurtosis: -0.82778341
Mean: 2008.6601
Median Absolute Deviation (MAD): 4
Skewness: -0.39051738
Sum: 1.4837168 × 10⁸
Variance: 17.333109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value             Count  Frequency (%)
2013              9420   9.0%
2009              8478   8.1%
2014              7536   7.2%
2005              7536   7.2%
2006              6594   6.3%
2012              5652   5.4%
2010              4710   4.5%
2008              4526   4.3%
2011              3768   3.6%
2003              2826   2.7%
Other values (7)  12820  12.2%
(Missing)         30902  29.5%

Value  Count  Frequency (%)
1999   942    0.9%
2000   1700   1.6%
2001   1884   1.8%
2002   2826   2.7%
2003   2826   2.7%
2004   942    0.9%
2005   7536   7.2%
2006   6594   6.3%
2007   2642   2.5%
2008   4526   4.3%

Value  Count  Frequency (%)
2015   1884   1.8%
2014   7536   7.2%
2013   9420   9.0%
2012   5652   5.4%
2011   3768   3.6%
2010   4710   4.5%
2009   8478   8.1%
2008   4526   4.3%
2007   2642   2.5%
2006   6594   6.3%

Promo2
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
0: 77060
1: 27708

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Most occurring characters

Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  104768  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  104768  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      77060  73.6%
1      27708  26.4%

Promo2SinceWeek
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 15
Distinct (%): 0.1%
Missing: 77060
Missing (%): 73.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.435037
Minimum: 1
Maximum: 48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 10
Median: 31
Q3: 40
95-th percentile: 48
Maximum: 48
Range: 47
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.696298
Coefficient of variation (CV): 0.59376872
Kurtosis: -1.5283546
Mean: 26.435037
Median Absolute Deviation (MAD): 14
Skewness: -0.24610407
Sum: 732462
Variance: 246.37377
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value             Count  Frequency (%)
40                5100   4.9%
10                3768   3.6%
37                2826   2.7%
1                 1884   1.8%
5                 1884   1.8%
31                1884   1.8%
45                1884   1.8%
48                1884   1.8%
14                942    0.9%
39                942    0.9%
Other values (5)  4710   4.5%
(Missing)         77060  73.6%

Value  Count  Frequency (%)
1      1884   1.8%
5      1884   1.8%
9      942    0.9%
10     3768   3.6%
13     942    0.9%
14     942    0.9%
18     942    0.9%
22     942    0.9%
31     1884   1.8%
35     942    0.9%

Value  Count  Frequency (%)
48     1884   1.8%
45     1884   1.8%
40     5100   4.9%
39     942    0.9%
37     2826   2.7%
35     942    0.9%
31     1884   1.8%
22     942    0.9%
18     942    0.9%
14     942    0.9%

Promo2SinceYear
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 77060
Missing (%): 73.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011.7081
Minimum: 2009
Maximum: 2014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2010
Median: 2012
Q3: 2014
95-th percentile: 2014
Maximum: 2014
Range: 5
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.8414341
Coefficient of variation (CV): 0.00091535851
Kurtosis: -1.3372398
Mean: 2011.7081
Median Absolute Deviation (MAD): 2
Skewness: -0.16803939
Sum: 55740408
Variance: 3.3908797
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value      Count  Frequency (%)
2014       7168   6.8%
2009       5652   5.4%
2011       5468   5.2%
2012       3768   3.6%
2013       3768   3.6%
2010       1884   1.8%
(Missing)  77060  73.6%

Value  Count  Frequency (%)
2009   5652   5.4%
2010   1884   1.8%
2011   5468   5.2%
2012   3768   3.6%
2013   3768   3.6%
2014   7168   6.8%

Value  Count  Frequency (%)
2014   7168   6.8%
2013   3768   3.6%
2012   3768   3.6%
2011   5468   5.2%
2010   1884   1.8%
2009   5652   5.4%

PromoInterval
Categorical

HIGH CORRELATION  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 77060
Missing (%): 73.6%
Memory size: 1.6 MiB
Jan,Apr,Jul,Oct: 17346
Feb,May,Aug,Nov: 5652
Mar,Jun,Sept,Dec: 4710

Length

Max length: 16
Median length: 15
Mean length: 15.169987
Min length: 15

Characters and Unicode

Total characters: 420330
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jan,Apr,Jul,Oct
2nd row: Jan,Apr,Jul,Oct
3rd row: Jan,Apr,Jul,Oct
4th row: Jan,Apr,Jul,Oct
5th row: Jan,Apr,Jul,Oct
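
PromoInterval stores a comma-separated list of month abbreviations, which usually has to be parsed before modelling. The sketch below is one possible approach and not something the report prescribes; `MONTH_NUMBERS` and `parse_promo_interval` are hypothetical names. It also recomputes the length statistics from the category counts:

```python
# Month abbreviations as they appear in PromoInterval. Note the four-letter
# "Sept", which makes one category 16 characters long instead of 15.
MONTH_NUMBERS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sept": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def parse_promo_interval(value):
    """Turn 'Jan,Apr,Jul,Oct' into [1, 4, 7, 10]; missing values give []."""
    if not value:
        return []
    return [MONTH_NUMBERS[name] for name in value.split(",")]

# Category counts copied from the report.
counts = {
    "Jan,Apr,Jul,Oct": 17_346,
    "Feb,May,Aug,Nov": 5_652,
    "Mar,Jun,Sept,Dec": 4_710,
}

# Total characters and mean length over the non-missing rows.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / sum(counts.values())

print(parse_promo_interval("Mar,Jun,Sept,Dec"))
print(total_chars, round(mean_length, 6))
```

An explicit lookup table is used here because "Sept" is not the three-letter form that standard month-abbreviation parsers typically expect.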

Common Values

Value             Count  Frequency (%)
Jan,Apr,Jul,Oct   17346  16.6%
Feb,May,Aug,Nov   5652   5.4%
Mar,Jun,Sept,Dec  4710   4.5%
(Missing)         77060  73.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value             Count  Frequency (%)
jan,apr,jul,oct   17346  62.6%
feb,may,aug,nov   5652   20.4%
mar,jun,sept,dec  4710   17.0%

Most occurring characters

Value              Count   Frequency (%)
,                  83124   19.8%
J                  39402   9.4%
u                  27708   6.6%
a                  27708   6.6%
A                  22998   5.5%
c                  22056   5.2%
t                  22056   5.2%
r                  22056   5.2%
p                  22056   5.2%
n                  22056   5.2%
Other values (13)  109110  26.0%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   226374  53.9%
Uppercase Letter   110832  26.4%
Other Punctuation  83124   19.8%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
u                 27708  12.2%
a                 27708  12.2%
c                 22056  9.7%
t                 22056  9.7%
r                 22056  9.7%
p                 22056  9.7%
n                 22056  9.7%
l                 17346  7.7%
e                 15072  6.7%
b                 5652   2.5%
Other values (4)  22608  10.0%

Uppercase Letter
Value  Count  Frequency (%)
J      39402  35.6%
A      22998  20.8%
O      17346  15.7%
M      10362  9.3%
F      5652   5.1%
N      5652   5.1%
S      4710   4.2%
D      4710   4.2%

Other Punctuation
Value  Count  Frequency (%)
,      83124  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   337206  80.2%
Common  83124   19.8%

Most frequent character per script

Latin
Value              Count  Frequency (%)
J                  39402  11.7%
u                  27708  8.2%
a                  27708  8.2%
A                  22998  6.8%
c                  22056  6.5%
t                  22056  6.5%
r                  22056  6.5%
p                  22056  6.5%
n                  22056  6.5%
l                  17346  5.1%
Other values (12)  91764  27.2%

Common
Value  Count  Frequency (%)
,      83124  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  420330  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
,                  83124   19.8%
J                  39402   9.4%
u                  27708   6.6%
a                  27708   6.6%
A                  22998   5.5%
c                  22056   5.2%
t                  22056   5.2%
r                  22056   5.2%
p                  22056   5.2%
n                  22056   5.2%
Other values (13)  109110  26.0%

DayOfWeek
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9979765
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.9973628
Coefficient of variation (CV): 0.49959344
Kurtosis: -1.2469264
Mean: 3.9979765
Median Absolute Deviation (MAD): 2
Skewness: 0.0020091688
Sum: 418860
Variance: 3.9894582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
5      15016  14.3%
4      15016  14.3%
3      15012  14.3%
2      15012  14.3%
1      14904  14.2%
7      14904  14.2%
6      14904  14.2%

Value  Count  Frequency (%)
1      14904  14.2%
2      15012  14.3%
3      15012  14.3%
4      15016  14.3%
5      15016  14.3%
6      14904  14.2%
7      14904  14.2%

Value  Count  Frequency (%)
7      14904  14.2%
6      14904  14.2%
5      15016  14.3%
4      15016  14.3%
3      15012  14.3%
2      15012  14.3%
1      14904  14.2%

Date
Date

Distinct: 942
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2015-07-31 00:00:00
Histogram with fixed size bins (bins=50)
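
The date range and the distinct-date count can be compared to check for calendar gaps. A minimal sketch using only figures quoted in this report; the variable names are illustrative assumptions:

```python
from datetime import date

first_day = date(2013, 1, 1)
last_day = date(2015, 7, 31)

# Inclusive number of calendar days covered by the Date column.
calendar_days = (last_day - first_day).days + 1

# Counts copied from the report.
distinct_dates = 942
distinct_stores = 112
n_observations = 104_768

# Rows a complete store x day panel would contain, and the shortfall.
full_panel_rows = distinct_stores * distinct_dates
missing_rows = full_panel_rows - n_observations

print(calendar_days, full_panel_rows, missing_rows)
```

The calendar span equals the 942 distinct dates, so no day is absent from the data as a whole. The shortfall relative to a full panel means some stores lack some days; the Store tables list two stores (1092 and 1027) with 758 rows instead of 942.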

Sales
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 17114
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10044.839
Minimum: 0
Maximum: 38722
Zeros: 16585
Zeros (%): 15.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7878
Median: 10324
Q3: 13238
95-th percentile: 19112
Maximum: 38722
Range: 38722
Interquartile range (IQR): 5360

Descriptive statistics

Standard deviation: 5686.5633
Coefficient of variation (CV): 0.56611791
Kurtosis: 0.33754372
Mean: 10044.839
Median Absolute Deviation (MAD): 2642
Skewness: -0.080041153
Sum: 1.0523777 × 10⁹
Variance: 32337002
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     16585  15.8%
10428                 26     < 0.1%
8946                  25     < 0.1%
8311                  24     < 0.1%
10425                 24     < 0.1%
10385                 24     < 0.1%
9501                  24     < 0.1%
9169                  24     < 0.1%
9662                  24     < 0.1%
9189                  23     < 0.1%
Other values (17104)  87965  84.0%

Value  Count  Frequency (%)
0      16585  15.8%
1407   1      < 0.1%
1410   1      < 0.1%
2070   1      < 0.1%
2099   1      < 0.1%
2177   1      < 0.1%
2223   1      < 0.1%
2326   1      < 0.1%
2347   1      < 0.1%
2363   1      < 0.1%

Value  Count  Frequency (%)
38722  1      < 0.1%
38484  1      < 0.1%
38367  1      < 0.1%
38037  1      < 0.1%
38025  1      < 0.1%
37646  1      < 0.1%
37403  1      < 0.1%
37376  1      < 0.1%
37122  1      < 0.1%
36417  1      < 0.1%

Customers
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3796
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1229.8038
Minimum: 0
Maximum: 7388
Zeros: 16585
Zeros (%): 15.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 838
Median: 1177
Q3: 1601
95-th percentile: 2828
Maximum: 7388
Range: 7388
Interquartile range (IQR): 763

Descriptive statistics

Standard deviation: 811.24939
Coefficient of variation (CV): 0.65965756
Kurtosis: 1.0621201
Mean: 1229.8038
Median Absolute Deviation (MAD): 376
Skewness: 0.66055617
Sum: 1.2884409 × 10⁸
Variance: 658125.57
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    16585  15.8%
1075                 175    0.2%
989                  166    0.2%
1070                 159    0.2%
1123                 158    0.2%
1113                 157    0.1%
1150                 156    0.1%
1032                 154    0.1%
1102                 153    0.1%
1145                 152    0.1%
Other values (3786)  86753  82.8%

Value  Count  Frequency (%)
0      16585  15.8%
156    1      < 0.1%
197    1      < 0.1%
237    1      < 0.1%
267    1      < 0.1%
274    1      < 0.1%
276    1      < 0.1%
280    1      < 0.1%
282    1      < 0.1%
301    1      < 0.1%

Value  Count  Frequency (%)
7388   1      < 0.1%
5494   1      < 0.1%
5458   1      < 0.1%
5387   1      < 0.1%
5297   1      < 0.1%
5192   1      < 0.1%
5152   1      < 0.1%
5145   1      < 0.1%
5132   1      < 0.1%
5112   1      < 0.1%

Open
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
1: 88189
0: 16579

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Most occurring characters

Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  104768  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  104768  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      88189  84.2%
0      16579  15.8%

Promo
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
0: 64744
1: 40024

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

Most occurring characters

Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  104768  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  104768  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      64744  61.8%
1      40024  38.2%

StateHoliday
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

SchoolHoliday
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
0: 85730
1: 19038

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 104768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Most occurring characters

Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  104768  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  104768  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  104768  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      85730  81.8%
1      19038  18.2%

Normalized_Sales
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 17114
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2594091
Minimum: 0
Maximum: 1
Zeros: 16585
Zeros (%): 15.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.20345024
Median: 0.26661846
Q3: 0.34187284
95-th percentile: 0.49356955
Maximum: 1
Range: 1
Interquartile range (IQR): 0.1384226

Descriptive statistics

Standard deviation: 0.14685614
Coefficient of variation (CV): 0.56611791
Kurtosis: 0.33754372
Mean: 0.2594091
Median Absolute Deviation (MAD): 0.068229947
Skewness: -0.080041153
Sum: 27177.772
Variance: 0.021566724
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     16585  15.8%
0.2693042715          26     < 0.1%
0.231031455           25     < 0.1%
0.2146325087          24     < 0.1%
0.2692267961          24     < 0.1%
0.2681937916          24     < 0.1%
0.2453643923          24     < 0.1%
0.236790455           24     < 0.1%
0.2495222354          24     < 0.1%
0.2373069573          23     < 0.1%
Other values (17104)  87965  84.0%

Value          Count  Frequency (%)
0              16585  15.8%
0.03633593306  1      < 0.1%
0.0364134084   1      < 0.1%
0.05345798254  1      < 0.1%
0.0542069108   1      < 0.1%
0.05622126956  1      < 0.1%
0.05740922473  1      < 0.1%
0.0600692113   1      < 0.1%
0.06061153866  1      < 0.1%
0.06102474046  1      < 0.1%

Value         Count  Frequency (%)
1             1      < 0.1%
0.9938536233  1      < 0.1%
0.9908320851  1      < 0.1%
0.982309798   1      < 0.1%
0.9819998967  1      < 0.1%
0.9722121791  1      < 0.1%
0.9659366768  1      < 0.1%
0.9652393988  1      < 0.1%
0.9586798203  1      < 0.1%
0.9404731161  1      < 0.1%
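
The report does not say how Normalized_Sales was derived. Its values are consistent with min-max scaling of Sales (minimum 0, maximum 38722), and it shares Sales' distinct count, zero count, CV, skewness and kurtosis. The sketch below tests that assumption on a few values that occupy the same row positions in both sets of frequency tables; `min_max_scale` is a hypothetical helper name:

```python
import math

# Extremes of Sales as reported in the Sales section.
SALES_MIN = 0
SALES_MAX = 38_722

def min_max_scale(sales, lo=SALES_MIN, hi=SALES_MAX):
    """Map a Sales value onto [0, 1] with min-max scaling."""
    return (sales - lo) / (hi - lo)

# (Sales value, Normalized_Sales value) pairs read from matching rows
# of the two variables' frequency tables.
pairs = [
    (10_428, 0.2693042715),
    (8_946, 0.231031455),
    (1_407, 0.03633593306),
    (38_484, 0.9938536233),
]

for sales, reported in pairs:
    # Agreement up to the rounding of the printed values.
    assert math.isclose(min_max_scale(sales), reported, rel_tol=1e-8)
```

If this reading is right, Normalized_Sales carries no information beyond Sales, which would also explain why the two are flagged as highly correlated.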

Interactions

2023-11-14T13:09:56.008200image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:57.056414image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:58.866942image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:00.147877image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:01.268489image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:02.346570image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:03.418818image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:04.750228image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:05.927409image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:07.105088image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:56.111828image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:57.185799image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:58.984784image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:00.266637image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:01.370330image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:02.467235image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:03.522442image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:04.878374image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:06.033544image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:07.212919image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:56.213432image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:57.332120image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:09:59.100851image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:00.371541image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:01.458262image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:02.559112image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:03.623199image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:04.988638image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/
2023-11-14T13:10:06.136612image/svg+xmlMatplotlib v3.7.3, https://matplotlib.org/

Correlations

| | Store | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2SinceWeek | Promo2SinceYear | DayOfWeek | Sales | Customers | Normalized_Sales | StoreType | Assortment | Promo2 | PromoInterval | Open | Promo | SchoolHoliday |
| Store | 1.000 | -0.053 | -0.151 | -0.062 | 0.130 | 0.300 | 0.000 | 0.020 | 0.007 | 0.020 | 0.270 | 0.231 | 0.272 | 0.567 | 0.041 | 0.000 | 0.000 |
| CompetitionDistance | -0.053 | 1.000 | -0.114 | 0.038 | -0.133 | -0.073 | -0.000 | -0.126 | -0.365 | -0.126 | 0.273 | 0.209 | 0.219 | 0.226 | 0.019 | 0.000 | 0.000 |
| CompetitionOpenSinceMonth | -0.151 | -0.114 | 1.000 | 0.014 | -0.290 | 0.024 | 0.000 | 0.015 | -0.037 | 0.015 | 0.379 | 0.395 | 0.156 | 0.624 | 0.044 | 0.000 | 0.000 |
| CompetitionOpenSinceYear | -0.062 | 0.038 | 0.014 | 1.000 | 0.092 | -0.047 | -0.000 | 0.065 | 0.048 | 0.065 | 0.280 | 0.416 | 0.426 | 0.834 | 0.035 | 0.000 | 0.000 |
| Promo2SinceWeek | 0.130 | -0.133 | -0.290 | 0.092 | 1.000 | -0.434 | 0.000 | 0.078 | 0.158 | 0.078 | 0.590 | 0.407 | 1.000 | 0.629 | 0.066 | 0.000 | 0.000 |
| Promo2SinceYear | 0.300 | -0.073 | 0.024 | -0.047 | -0.434 | 1.000 | 0.000 | 0.000 | -0.005 | 0.000 | 0.512 | 0.389 | 1.000 | 0.362 | 0.052 | 0.000 | 0.000 |
| DayOfWeek | 0.000 | -0.000 | 0.000 | -0.000 | 0.000 | 0.000 | 1.000 | -0.440 | -0.379 | -0.440 | 0.000 | 0.000 | 0.000 | 0.000 | 0.851 | 0.496 | 0.273 |
| Sales | 0.020 | -0.126 | 0.015 | 0.065 | 0.078 | 0.000 | -0.440 | 1.000 | 0.835 | 1.000 | 0.136 | 0.081 | 0.100 | 0.099 | 0.993 | 0.501 | 0.098 |
| Customers | 0.007 | -0.365 | -0.037 | 0.048 | 0.158 | -0.005 | -0.379 | 0.835 | 1.000 | 0.835 | 0.392 | 0.361 | 0.099 | 0.184 | 0.854 | 0.347 | 0.088 |
| Normalized_Sales | 0.020 | -0.126 | 0.015 | 0.065 | 0.078 | 0.000 | -0.440 | 1.000 | 0.835 | 1.000 | 0.136 | 0.081 | 0.100 | 0.099 | 0.993 | 0.501 | 0.098 |
| StoreType | 0.270 | 0.273 | 0.379 | 0.280 | 0.590 | 0.512 | 0.000 | 0.136 | 0.392 | 0.136 | 1.000 | 0.482 | 0.129 | 0.461 | 0.128 | 0.000 | 0.000 |
| Assortment | 0.231 | 0.209 | 0.395 | 0.416 | 0.407 | 0.389 | 0.000 | 0.081 | 0.361 | 0.081 | 0.482 | 1.000 | 0.013 | 0.140 | 0.084 | 0.000 | 0.001 |
| Promo2 | 0.272 | 0.219 | 0.156 | 0.426 | 1.000 | 1.000 | 0.000 | 0.100 | 0.099 | 0.100 | 0.129 | 0.013 | 1.000 | 1.000 | 0.005 | 0.000 | 0.003 |
| PromoInterval | 0.567 | 0.226 | 0.624 | 0.834 | 0.629 | 0.362 | 0.000 | 0.099 | 0.184 | 0.099 | 0.461 | 0.140 | 1.000 | 1.000 | 0.022 | 0.000 | 0.007 |
| Open | 0.041 | 0.019 | 0.044 | 0.035 | 0.066 | 0.052 | 0.851 | 0.993 | 0.854 | 0.993 | 0.128 | 0.084 | 0.005 | 0.022 | 1.000 | 0.290 | 0.094 |
| Promo | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.496 | 0.501 | 0.347 | 0.501 | 0.000 | 0.000 | 0.000 | 0.000 | 0.290 | 1.000 | 0.071 |
| SchoolHoliday | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.273 | 0.098 | 0.088 | 0.098 | 0.000 | 0.001 | 0.003 | 0.007 | 0.094 | 0.071 | 1.000 |
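A matrix of this size is easier to read once the strong pairs are pulled out. The following is a minimal sketch, not the report generator's own logic: it copies a five-variable subset of the values above into a plain dictionary and lists the pairs whose absolute correlation reaches a cutoff. The helper name `strong_pairs` and the 0.5 cutoff are assumptions chosen for illustration.

```python
# Subset of the correlation matrix above; values are copied from the table.
# Only the upper triangle is stored because the matrix is symmetric.
CORR = {
    ("DayOfWeek", "Sales"): -0.440,
    ("DayOfWeek", "Customers"): -0.379,
    ("DayOfWeek", "Open"): 0.851,
    ("DayOfWeek", "Promo"): 0.496,
    ("Sales", "Customers"): 0.835,
    ("Sales", "Open"): 0.993,
    ("Sales", "Promo"): 0.501,
    ("Customers", "Open"): 0.854,
    ("Customers", "Promo"): 0.347,
    ("Open", "Promo"): 0.290,
}


def strong_pairs(corr, threshold=0.5):
    """Return (a, b, value) tuples with |value| >= threshold, strongest first.

    `threshold` is a hypothetical cutoff, not a value taken from the report.
    """
    selected = [(a, b, v) for (a, b), v in corr.items() if abs(v) >= threshold]
    return sorted(selected, key=lambda item: abs(item[2]), reverse=True)


pairs = strong_pairs(CORR)
for a, b, value in pairs:
    print(f"{a:>10} ~ {b:<10} {value:+.3f}")
```

Comparing on the absolute value keeps strong negative relationships in view. With a lower cutoff such as 0.4, the DayOfWeek and Sales pair (-0.440) would also be listed.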

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
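Nullity correlation can be sketched with pandas by correlating the missing-value indicators of each column. The example below is an illustration under stated assumptions, not the code that produced this report: the toy frame reuses column names from this dataset, but its values are hypothetical and only mimic the pattern in which the Promo2 fields are missing together.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the missingness pattern matters here.
df = pd.DataFrame({
    "CompetitionDistance": [620.0, 620.0, 870.0, 870.0, 500.0, 500.0],
    "CompetitionOpenSinceMonth": [9.0, 9.0, np.nan, np.nan, 3.0, np.nan],
    "CompetitionOpenSinceYear": [2009.0, 2009.0, np.nan, np.nan, 2010.0, np.nan],
    "Promo2SinceWeek": [np.nan, np.nan, np.nan, np.nan, 14.0, 14.0],
    "Promo2SinceYear": [np.nan, np.nan, np.nan, np.nan, 2011.0, 2011.0],
})

# True where a value is missing.
is_missing = df.isna()

# Columns that are always present (or always missing) have zero variance,
# so their correlation is undefined; drop them before correlating.
varying = is_missing.loc[:, is_missing.nunique() > 1]

# Pearson correlation of the 0/1 missingness indicators.
nullity_corr = varying.astype(int).corr()
print(nullity_corr.round(3))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is present when the other is missing, and a value near 0 means the missingness patterns are unrelated.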

Sample

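First rows

| | Store | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | Normalized_Sales |
| 2826 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 5 | 2015-07-31 | 13995 | 1498 | 1 | 1 | 0 | 1 | 0.361422 |
| 2827 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 4 | 2015-07-30 | 10387 | 1276 | 1 | 1 | 0 | 1 | 0.268245 |
| 2828 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 3 | 2015-07-29 | 10514 | 1258 | 1 | 1 | 0 | 1 | 0.271525 |
| 2829 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 2 | 2015-07-28 | 10275 | 1191 | 1 | 1 | 0 | 1 | 0.265353 |
| 2830 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 1 | 2015-07-27 | 11812 | 1379 | 1 | 1 | 0 | 1 | 0.305046 |
| 2831 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 7 | 2015-07-26 | 0 | 0 | 0 | 0 | 0 | 0 | 0.000000 |
| 2832 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 6 | 2015-07-25 | 9322 | 1219 | 1 | 0 | 0 | 0 | 0.240742 |
| 2833 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 5 | 2015-07-24 | 8322 | 1108 | 1 | 0 | 0 | 1 | 0.214917 |
| 2834 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 4 | 2015-07-23 | 7286 | 1101 | 1 | 0 | 0 | 1 | 0.188162 |
| 2835 | 4 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN | 3 | 2015-07-22 | 8503 | 1108 | 1 | 0 | 0 | 1 | 0.219591 |

Last rows

| | Store | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | Normalized_Sales |
| 1016257 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 4 | 2013-01-10 | 18075 | 2641 | 1 | 1 | 0 | 0 | 0.466789 |
| 1016258 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 3 | 2013-01-09 | 17073 | 2481 | 1 | 1 | 0 | 0 | 0.440912 |
| 1016259 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 2 | 2013-01-08 | 18816 | 2588 | 1 | 1 | 0 | 0 | 0.485925 |
| 1016260 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 1 | 2013-01-07 | 21237 | 2962 | 1 | 1 | 0 | 0 | 0.548448 |
| 1016261 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 7 | 2013-01-06 | 0 | 0 | 0 | 0 | 0 | 0 | 0.000000 |
| 1016262 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 6 | 2013-01-05 | 18856 | 3065 | 1 | 0 | 0 | 0 | 0.486958 |
| 1016263 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 5 | 2013-01-04 | 18371 | 3036 | 1 | 0 | 0 | 1 | 0.474433 |
| 1016264 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 4 | 2013-01-03 | 18463 | 3211 | 1 | 0 | 0 | 1 | 0.476809 |
| 1016265 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 3 | 2013-01-02 | 20642 | 3401 | 1 | 0 | 0 | 1 | 0.533082 |
| 1016266 | 1114 | a | c | 870.0 | NaN | NaN | 0 | NaN | NaN | NaN | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | 0.000000 |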